WHAT IS CLAIMED IS:

1. An application monitoring and disaster recovery management system,
comprising: 

a primary computing environment, including a primary server executing an
application;

a secondary computing environment, including a secondary server capable of 
executing said application; 

a management server, executing a monitoring and management server module,
that is in communications with said primary server and said secondary server;

a graphical user interface, in communications with said monitoring and
management server module, capable of allowing a user to perform a failure
switch-over from said primary computing environment to said secondary computing
environment for said application in a single action;

whereby said system allows for disaster recovery and fault tolerance, and limits
computing down-time experienced by end-users of said primary computing
environment.

2. The system of Claim 1, further comprising:

a first plurality of intelligent agents distributed within said primary computing
environment, wherein each of said first plurality of intelligent agents monitors a metric
related to said application executing on said primary server.

3. The system of Claim 2, wherein each of said first plurality of intelligent
agents is in communications with said monitoring and management server module,
and said graphical user interface is capable of displaying the metric corresponding to 
each of said first plurality of intelligent agents. 

4. The system of Claim 3, further comprising:

a second plurality of intelligent agents distributed within said secondary 
computing environment, wherein: 

each of said second plurality of intelligent agents monitors a metric 
related to a subsystem within said secondary computing environment; 

each of said second plurality of intelligent agents is in communications
with said monitoring and management server module; and 

said graphical user interface is capable of displaying the metric 
corresponding to each of said second plurality of intelligent agents. 

5. The system of Claim 1, further comprising: 

a primary data repository located within said primary computing environment 
and accessible by said primary server; 

a secondary data repository located within said secondary computing 
environment and accessible by said secondary server; and 

means for synchronizing data stored in said primary data repository and said 
secondary data repository in real time as new data are written to said primary data 
repository as said application executes. 

6. The system of Claim 5, wherein said means for synchronizing data 
comprises a communications link from said primary server to said secondary server. 

7. The system of Claim 5, further comprising: 

a plurality of archival data stores, each accessible by said secondary data 
repository, wherein each of said plurality of archival data stores is capable of storing 
a different point-in-time level snapshot of data stored in said secondary data repository. 

8. The system of Claim 5, further comprising: 

a plurality of intelligent agents distributed within said primary computing
environment, wherein each of said plurality of intelligent agents monitors a metric 
related to said primary data repository. 

9. The system of Claim 8, wherein each of said plurality of intelligent
agents is in communications with said monitoring and management server module;
and said graphical user interface is capable of displaying the metric corresponding to 
each of said plurality of intelligent agents. 

10. The system of Claim 1, wherein said graphical user interface is further
capable of allowing a user to perform a switch-back from said secondary computing 
environment to said primary computing environment for said application in a single 
action. 

11. The system of Claim 10, wherein said single action is a button click by
the user on said graphical user interface. 

12. The system of Claim 1, wherein said primary computing environment
and said secondary computing environment are geographically dispersed. 

13. The system of Claim 1, wherein said primary and secondary computing
environments, said management server and said graphical user interface are 
interconnected over at least a portion of the global, public Internet. 

14. A method for providing a user with an application monitoring and 
disaster recovery management tool, comprising the steps of: 

deploying a first plurality of intelligent agents within a primary computing 
environment, said primary computing environment including a primary server 
executing an application, and wherein each of said first plurality of intelligent agents 
monitors a metric related to said application; 

monitoring, by a monitoring and management server module executing on a
management server, a plurality of states, each of said plurality of states being rendered 
by one of said first plurality of intelligent agents; 

displaying to the user, via a graphical user interface in communications with
said monitoring and management server module, said plurality of states; and

performing a failure switch-over from said primary computing environment to 
a secondary computing environment having a secondary server capable of executing 
said application in response to a first input received from the user via said graphical 
interface; 

whereby said method allows for disaster recovery and fault tolerance, and limits
computing down-time experienced by end users of said primary computing
environment.

15. The method of Claim 14, wherein said application is an electronic mail
application, and said failure switch-over comprises the step of temporarily changing
the hostname of said secondary server to the hostname of said primary server.

16. The method of Claim 14, wherein said primary computing environment
and said secondary computing environment are geographically dispersed. 

17. The method of Claim 14, wherein said first input is received by said
monitoring and management server module as a result of a button click by the user on
said graphical user interface.

18. The method of Claim 14, further comprising the step of:

performing a switch-back from said secondary computing environment to said
primary computing environment in response to a second input received from the user
via said graphical interface.

19. The method of Claim 18, wherein said second input is received by said
monitoring and management server module as a result of a button click by the user
on said graphical user interface.

20. The method of Claim 14, further comprising the steps of:

deploying a second plurality of intelligent agents within said secondary
computing environment, wherein each of said second plurality of intelligent agents
monitors a metric related to a subsystem within said secondary computing
environment;

monitoring, by said monitoring and management server module, a second
plurality of states, each of said second plurality of states being rendered by one of said
second plurality of intelligent agents; and

displaying to the user, via said graphical user interface, said second plurality of
states.

21. The method of Claim 14, further comprising the step of:

synchronizing data stored in a primary data repository accessible to said
primary server within said primary computing environment and a secondary data
repository accessible to said secondary server within said secondary computing
environment in real time as new data are written to said primary data repository as said
application executes.

22. The method of Claim 19, further comprising the step of:

archiving data from said secondary data repository to one of a plurality of
archival data stores in response to a second input received from the user via said
graphical interface, wherein each of said plurality of archival data stores contains a
different point-in-time level snapshot of data stored in said secondary data repository.

23. A computer program product comprising a computer usable medium
having control logic stored therein for causing a computer to provide a user with an 
application monitoring and disaster recovery management tool, said control logic 
comprising: 

first computer readable program code means for causing the computer to deploy 
a plurality of intelligent agents within a primary computing environment, said primary 
computing environment including a primary server executing an application, and 
wherein each of said plurality of intelligent agents monitors a metric related to said 
application; 

second computer readable program code means for causing the computer to 
monitor a plurality of states, each of said plurality of states being rendered by one of 
said plurality of intelligent agents; 

third computer readable program code means for causing the computer to 
display to the user, via a graphical user interface, said plurality of states; and 

fourth computer readable program code means for causing the computer to 
perform a failure switch-over from said primary computing environment to a secondary 
computing environment having a secondary server capable of executing said 
application in response to an input received from the user via said graphical interface. 

24. The computer program product of Claim 23, wherein said application 
is an electronic mail application, and further comprising: 

fifth computer readable program code means for causing the computer to 
temporarily change the hostname of said secondary server to the hostname of said 
primary server. 

25. The computer program product of Claim 23, wherein said first computer
readable program code means comprises:

fifth computer readable program code means for causing the computer to query
said application once every pre-determined time period in order for each of said plurality
of intelligent agents to monitor said corresponding metric related to said application.
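
By way of non-limiting illustration only, the periodic querying recited in Claim 25 and the user-initiated switch-over and switch-back recited in Claims 14 and 18 might be sketched as follows. This is a minimal sketch and forms no part of the claims. Every name in it (`Agent`, `Monitor`, `probe`, `threshold`, `period_s`, the `"ok"`/`"alert"` state labels) is a hypothetical assumption introduced for the example, and the actual switch-over mechanics (for example, the hostname change of Claim 15) are deliberately reduced to a flag.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Hypothetical intelligent agent: samples one metric and renders a state."""
    name: str
    probe: Callable[[], float]  # assumed query against the monitored application
    threshold: float            # assumed limit above which the state is "alert"

    def render_state(self) -> str:
        return "alert" if self.probe() > self.threshold else "ok"


@dataclass
class Monitor:
    """Hypothetical stand-in for the monitoring and management server module."""
    agents: List[Agent]
    active: str = "primary"
    states: Dict[str, str] = field(default_factory=dict)

    def poll_once(self) -> Dict[str, str]:
        # Collect the state rendered by each agent.
        self.states = {a.name: a.render_state() for a in self.agents}
        return self.states

    def poll(self, cycles: int, period_s: float,
             sleep: Callable[[float], None] = time.sleep) -> None:
        # Query once every pre-determined period; `sleep` is injectable for tests.
        for _ in range(cycles):
            self.poll_once()
            sleep(period_s)

    def switch_over(self) -> None:
        # Single user action (e.g. a button click): primary -> secondary.
        self.active = "secondary"

    def switch_back(self) -> None:
        # Single user action: secondary -> primary.
        self.active = "primary"
```

In this sketch a graphical user interface would display `Monitor.states` and bind one button each to `switch_over` and `switch_back`; data synchronization and archival are outside its scope.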






